NES DTE - You can do it, we can help

NOTE: I have updated this document in June 2008, re-writing a few parts I thought were bad. I have recreate the screenshots for the modified steps.
I have darkened the depreciated instructions.

DTE, or Dual Tile Encoding is a method to compress data. Usually a ROM hacker would use this method to fit more text into a video game ROM than was originally intended. This is very useful when translate Japanese games (which can say more stuff with fewer characters) to a language such as English (which is comparatively wordy, thus needing extra ROM space to account for the wordlyness).
What DTE does is it takes advantage of the fact that English has a shorter character set (or "alphabet" as some of you might think to call it). English must need a good 60 or more fewer characters than a Japanese hiragana+katakana character set. Each character is represented by a different hexadecimal value in the ROM. So, typically, after a translation ROM hacker has finished changing the Japanese kana font into English letters, there are often many hex values that are never used, since English doesn't need that many different hex values to represent the different possible characters. What DTE does is use the unused portion of the font table as a signal to print two letters instead of one, thus increasing the amount of text that can be entered into the ROM.

Although it is nice, it also requires that the game be programmed to support such a compression technique. Usually, that requires somebody knowledgable in 6502 assembly (or ASM) to find and modify the game's code to support such a technique. I've done some work before, and I noticed I was often re-writing the same code, replacing a few game-specific variables each time. So know, I'll show you a common technique of how I find and create a typical ROM hack to allow DTE support.

Required Prior Knowledge

- Hexdecimal, and how to do basic hex editing.
- Pointers (what they are).
- Kana (since I'm doing a Japanese example)

Tools For the Job

- FCEUX, currently the greatest debugger NES emulator available.

- WindHex, Bongo`'s hex editor for Windows. Check his documents page, download the file, rename it to Winhex.exe, run it to install. The only hex editor I know of with a kana search option.

- Ys III: Wanderers from Ys (NES ROM. It is not provided and will not be provided no matter how much you have trouble finding it or how badly you need it to pay off the hitmen on your tail). This is the example game.

Basic Rundown of the Routine

- Find the text in the ROM (I assuming you know the steps to doing this).
- Find the text in the RAM
- Find the ASM to load the text
- Finding available RAM space
- Inserting the code

STEP 1: Finding the text in the ROM

NOTE: I won't be going over the kana chart or how to build the table. I'll assume you know how to do that already.

First, we'll need a sample line of text. Let's play the game until we find some text.

ys3game.PNG (14636 bytes)

Okay, let's try to find this line of text in the ROM. We've already made our font table, so let's open WinHex.

Kana!!
You'll want to go to View, and set "View Text Data as Unicode", and it also helps to go to Option->Set Font->Text Font, and set it to like 12 to get visible kana.

Now, choose Search->Search Kana, or press Ctrl-K.

Now click on the empty white box on the right. Select the "Hiragana character set". Let's look for Stupid old-ass FrontPage won't let me type it..


And now click Find.


We find that this line starts of text starts at 2BBC1. We will need that information shortly.

Step 2: Finding the text in CPU address form

Okay, we know where the text is in ROM, but since we are going to be intercepting a value we don't need in English, and changing it to one of two values we need for English, we will need to know where in address space the text is.  While a save state might help (if you happen to know the format your favorite emulator uses), it is reasonably easy to find this in address space while running FCE Ultra (assuming the text reading code wasn't programmed by aliens).

However, you will need a small lesson on the NES' CPU (don't worry, it's not much).

The NES can address 65,536 addresses, numbered 0 to FFFF (this is called address space). The last half of these (8000 to FFFF, totalling 32KB) are used to access ROM. Now, in the days of the original Super Mario Bros., that was enough to hold the entire ROM (specifically, the program or PRG (meaning all code and non-graphics data) ROM) in memory (many games also used a seperate ROM for graphics, called a character or CHR ROM, but that's beside the point of this article). But, soon after, the NES game designers realized that wasn't enough. So, they designed special hardware in the cartridges that would break the ROM down into pieces (those pieces are called "banks"), and would direct the CPU to those parts.

However, the size of those banks was not the same for every game. Some games were fine with having 16KB banks (being able to fit two non-contiguous banks of the ROM into address space at once), others opted for 8KB banks (having four non-connected banks of the ROM loaded into the address space at one time).

Why you need to know this is because you cannot tell where in address space the text will be just from looking at the ROM.

Here's a small chart of which parts of address space are used by each bank.

ROM bank size Loaded bank 1 accessed by reading Loaded bank 2 accessed by reading Loaded bank 3 accessed by reading Loaded bank 4 accessed by reading
32KB (8000h bytes) 8000-FFFF Not Available Not Available Not Available
16KB (4000h bytes) 8000-BFFF C000-FFFF Not Available Not Available
8KB (2000h bytes) 8000-9FFF A000-BFFF C000-DFFF E000-FFFF

(Hopefully I can tie all these numbers together by the end of this section if you're confused)

Okay, now that you know how large and where each ROM bank is it is important to look at where your text is relative to the start of the ROM bank it resides in. Let's have another look at our text.

Some hex math would tell us each bank is a multiple of 1000, and all can be divided equally into 10000. So, that would mean we can look at just the 1000s digit of our hex to determine where a bank starts.

ROM Bank Size A bank starts at each 1000s value of:
32KB 0000, 8000
16KB 0000, 4000, 8000, C000
8KB 0000, 2000, 4000, 6000, 8000, A000, C000, E000

But if I haven't put you to sleep yet, you're probably now wondering: okay, quite the technical crap and tell me which I need to know for the game.

Fortunately, FCEUltra tells us which bank size the game uses. Let's load up the game, the click Help->Message Log...

messlog.PNG (15688 bytes)
The important part of this is the second value after "PRG ROM". It tells us that the game uses 16KB RAM banks.

So we see we have the ROM address: 2BBE2.
We see 16KB banks start at ?0000, ?4000, ?8000, ?C000 in the ROM (? representing any 10000s digit).
We see that the value closest to, but lower than 2BBE2 is 8000, so this ROM bank begins at 28000.

Now, where in the address space is 2BBE2? Let's use our Windows Calculator to find out.
Open Calculator. If it is set to Standard mode, click View->Scientific. Click Hex.

Calculator, yes it is.
Type 2BBE2 - 28000. Press Enter.

More Calculator!!
We get 3BE2. Now we know that this end-of-text message "pointer" will be at 3BE2h bytes after the start of one of the banks in address space.

So, in a ROM with 16KB banks (ROM banks are stored in address space beginning at 8000 and C000),
that means we'll be looking for either BBE2 (8000+3BE2) or FBE2 (C000+3BE2).

Okay, we know the ROM address, and we know the possible CPU addreses that represent the ROM address.
Load the ROM up, then click Tools -> Debug.

Now look towards the upper-right, and you'll see something called "Breakpoints".
Set one, and you can make the emu automatically pause whenever a specific address of CPU or PPU RAM is read or written to, or
used to execute code.
We know that $BBB1 or $FBB1 is the CPU address that reprents our text. So, let's use a breakpoint to find the code that reads it.
SECRET: For 16KB, the second CPU bank is almost always locked to the last PROGRAM ROM (or PRG-ROM) bank.
So, if you can do hex math: convert the PRG-ROM size to hex (256KB = $40000 bytes). Subtract 16KB ($4000). That equals $3C000.
If the ROM address (without header) >= $3C000, use the second CPU address calculated, otherwise use the first.
Click the Add... button.

We leave Memory at CPU, since were reading ROM (or the NES' main RAM). We would've clicked PPU if we wanted to check VRAM.
We want to find code that READS the ROM, so we click "Read". Now, we enter "BBB1", and click OK. (if you want, you can repeat for $FBB1).
Now click Run and go back to the game. Play until you reach the pointer where the text is displayed.
(the game stops before the title, and opens the code window with a line ending in "= 83", but I know the value it should've read is $47. So,
this isn't the result I want. I click Run to continue. It goes on, I press Start to begin the game, and Adol goes to talk to the guard.
The debugger appears with this:
(image clipped to show relavent info only)

The top line is one that resulted in reading the text data. It returns a $47, which is the value I was expecting it to read, so this must be the code.
Most often, the instruction will be LDA ($nn),y.

However, if it is not, it may possibly be a hard-coded pointer (with an instruction like LDA $BBB1,x). Such as if the game stored text at 9700 in address space, the game may actually instruct the NES to load explicitly from BBB1, in which you may need to do a different routine for each string. However, I don't imagine this would happen unless the game had very little text (in which case I imagine you may be better using uncompressed text. If there's a demand and a need, maybe I could try to get an example done, but somehow I suspect this may need to be specially written for the game).
If you care to know what that ASM statement means, it is a command telling the NES to load a number into the NES' Accumulator, which is basically the NES hexadecimal calculator. The rest of the statement tells the NES where and how to calculate the number.
First, the NES should take the hexadecimal value 9B, and add Y to that ("Y" is a special memory address called a register that the NES' CPU has reserved for special functions to modify statements). Y is usually set to 0 (by the statement LDY #$00, which means load the hexadecimal number 0 into Y).
So, now we have 9B+0=9B. Now the NES looks at 9B, and sees that 9B and 9C (it knows that this statement implies it is looking for a 2-byte value) and sees the value BBE3.
It knows this to read the value from that CPU address and store the result in register A.
Note that if you don't find a LDA ($something),Y command that works, you might find a LDA ($something,X) command. This is very similar to the above command except that the game looks for the pointer at something in the 0000-00FF page of RAM, instead of at something+Y on the 0000-00FF page.
In the case you find this statement to be used, you should replace LDA ($something),Y with LDA ($something,X) for the rest of the steps.
If neither works, you may search for hard-coded pointers (which would involve an ASM code of something like LDA $pointer,X. You can use LDA $pointer, X in place of the LDA ($9B),Y command, but be warned that a routine on a hard-coded pointer will likely work only on that text string.

LDA ($something,X) is A1 something (something which will be 1 byte long).

After we've got the Debugger stopped in Step 2,
immediately click on Tools->Hex Editor.

By default, you should be looking at the CPU address space. Now, noting the address of the instruction that stopped the debugger ($8288), scroll down to there
in the hex editor. Now, right click on the contents of that address, then click "Go here in ROM file". The hex editor will switch to the ROM and point you
to address 28298.

Step 3: Finding available RAM space

(NOTE: For those interested, I'll try to start explaining the ASM codes as they come up. Those will be in green boxes if you want to skip them.)

Now, we are going to want to find some empty space in ROM that is accessible right when the text is read. While the debugger has still paused the game, we can search.

NOTE: Only bother searching between $8000 and $FFFF. Some of the rest of the space is used for system and cartridge RAM, a few are used for I/O purposes, but most of it is undefined.

We are going to want to look for a large chunk of empty ROM space, specifically about 300 bytes or more (about a page and a half or more of 00s or FFs are usually a good clue). We see some around 9650 to 9DFF. (just use FCEUXD SP's hex editor. Right click on an available space in the CPU (address $8000+) and click "Go to here in ROM".)

So, let's make a jump from the original code to here.
You'll also want to look for a place in ROM for a DTE table to go. You'll want one long block of empty space (at least (number of DTE pairs desired x 2) of connected ROM space, or two seperate blocks of space with the desired number of DTE pairs available in each chunk. For this game, $9700 (ROM $29710) looks like a good spot for this.

Go back to 28296, and type 20 50 96 EA (JSR $9650    NOP).

JSR (20) is a code to signify that we want to jump to a different address in RAM, but that we want the NES to leave a record of where we jumped from, so that we only need an RTS (Return from Subroutine) command because we want to go back to the original code.

JMP (4C) is a code to signify that we want to jump to a different address in RAM, but we do NOT want to have the NES leave a record of where we jumped from, since we don't intend to quickly back there once the current routine is done. Then you must end the hack routine with another JMP command.

NOP (EA) is a code to do nothing (No operation). We need this because we were replacing 4 bytes of ASM with 3 bytes of code. After an RTS the game will jump back to the byte immediately following the JSR command. 9B 20, instead of 20 29 8A (which is a completely different command and will surely crash the game). So, we put in NOP since it is only 1 byte and does nothing, therefore the game will not execute unintended operations that lead to a crash.

Now, one thing I would suggest you do is try to play the game for awhile with the memory viewer open, so you can try to spot an unused byte of RAM (one that seems to always be 00 is a good choice). You'll need a spare byte of RAM for the DTE hack to be successful. One you've found one go to our DTE hack and type:

NOTE: System RAM is address $0000-07FF. If the game uses it, cartridge RAM is usually $6000-7FFF. But some cartridge mappers can disable it. So avoid using it if you can.
For our purposes assume all other space is undefined.

E6 low-byte if it is found on between 0000 and 00FF (I think CF is a possible free byte, so I'll try E6 CF)
EE low-byte high-byte if it is found at 0100 or greater.

INC is used to increase a value in an address in RAM by 1. FF+1 = 00.
Use E6, and a single byte if you are increasing a value between 0000 and 00FF.
Use EE, followed by two bytes if you are increasing a value at address 0100 or higher.

Now try playing the basic gameplay awhile (talk to some villagers, buy some items, kill a few monsters). Nothing seems broken? Okay then.

POSSIBLE SIDE TESTS:

One more step you may want to do is go to the RTS (60) in the routine, and try a INX (E8) command too. The Y value shouldn't be changed in this routine, but the X value (which is also a memory address that the CPU reserves for special functions) will be, so we will want to test to see if changing the X register corrupts the game.

So, now we should A0 00 B1 9B E8 60 entered at 29660. Again, sample the basic gameplay to see if anything breaks (well, I imagine only the text would likely break if anything does). Note that if the X register does need to be unaltered, you will want to secure an additional free RAM byte to hold our DTE tile (we are going to back up the X value, read a byte of the DTE table, save the byte of the DTE table memory, then restore the back-up of X).

We may also want to a simple text replacement exercise:
To test if our routine will only affect a letter, and no other codes, let's do a simple code that changes a specific character.
Go back to
LDY #$00
LDA ($9B),Y
RTS
(A0 00 B1 9B 60) in our sample.

Now, we may want to try
LDY #$00
LDA ($9B),Y
CMP #$47        (this command checks to see if we read a katanana GA (hex 47) in the script, which is the
                        first letter of our message)
BNE $02           (if it isn't, skip the next two bytes, which are the next command)
LDA #$80        (load a hiragana a (hex 80))
RTS

If done right, this should change any katakana GA's into hiragana a's and make no other changes to the text.
If that is successful, we can be sure it is safe to go on.

Step 4: Inserting the code

Now I'm going to give you the code I use, and explain what each thing does.

ASM CODE HEXADECIMAL EXPLANATION
This code should begin at 29660 in the ROMfile. See ASM code descritions at the bottom if interested.
LDA $freeRAMbyte A5 freeRAMbyte (if address 1 byte long)
AD freeRAMbyte (if address 2 bytes long)
Load a value from a RAM offset into register A.
CMP #$00 C9 00 Compares the value in A to the hexadecimal  number "0".
(comparing A to 0 is unnecessary if A was used for the previous instruction)
BNE 2ndtime D0 Number of bytes (counting the byte after the D0 xx command as "0") to reach the code for the "2ndtime" routine. Makes a jump if freeRAMbyte does not equal 0.
FreeRAMByte should equal 0 the first time we run this, so we should immediately proceed to check if the byte of script is within the range of hexadecimal values we intend to use (for this example, we'll use 15 to 7F as our DTE font range)
LDY #$00 A0 00 Preparation for next step. See Step 3 for explanation.
LDA ($9B),Y B1 9B Read a byte of the script data. Again, see Step 3 for an explanation of this command.
CMP #$14 C9 14 This is checking to see if the value read from the script is greater (BPL does not include equal to) than $14.
BPL NextCheck 10 Number of bytes to reach NextCheck (see BNE command above)
This is the Abort routine. The game should go here if the byte of the script read is NOT GREATER THAN 14 (meaning 14 or less). This is because I chose to have 15 as the lowest value in the font table to represent a DTE value, and I chose that so that my code doesn't modify control codes that the game needs in order to work.
LDA ($9B),Y B1 9B Reload the original script value.
RTS 60 Go back to the original code without changing the script.
Here is the NextCheck routine. The last routine checked to see if we went below the low end of the DTE table. This one will see that we do not exceed the high end of the table.
LDA ($9B),Y B1 9B You know this code by know. I believe this is an unnecessary step actually when it is performed right after a JUMP/BRANCH, but better safe than sorry.
CMP #$7F C9 7F Check that the value in the script does not exceed the highest value in the DTE table.
BPL Abort 10 Number of bytes to reach Abort Note, to go backwards, count the 10 in this command as "FE", and each decrease in the count byte makes you go back one byte further in RAM.
Now, we should be within the sample DTE table range of 15-7F, and we should be preparing to write the first of the two tiles.
INC freeRAMbyte E6 freeRAMbyte (if address is 1 byte long) or
EE freeRAMbyte (if address is 2 byte longs)
We will use our freeRAMbyte as a "flag" to set if we should be one the first/only letter, or if we've confirmed a second letter is necessary.
Making freeRAMbyte not equal 0 signifies UseDTE=yes.
Note that the gray lines will only be necessary if we determined in step 4 that changing the X register corrupts the game. If it does, we will want to preserve its contents. This code will quickly copy the contents to the X register to a special very temporary area of memory called the stack. It is most notably used in JSR functions to record where the game needs to return to when it hits a RTS, so we will want to be sure not to leave this routine until we return the Stack to it's previous content.
TAX 8A Only values in the accumulator (the NES' caluclator unit basically) can be backed up. This copies the X value to the accumulator.
PHA 48 This "pushes" the A value (which is a copy of X now) onto/into our temp. RAM space.
LDA ($9B),Y B1 9B -
SEC 38 Preperation for a subtraction command.
SBC #$15 E9 15 Since hex value 15 = DTE pair #00, hex 16 = DTE pair #01, etc. we need to subtract the value of the lowest valid use-DTE-compression hex.
ASL A 0A Take the result of our subtraction (which is stored in the "accumulator" and double it). This is done since DTE pair #00 uses bytes 0 and 1 in the DTE table, DTE pair #01 used bytes 2 and 3, DTE #02 uses bytes 4 and 5, etc.
TAX AA Transfer that result to the X register (and special variable you'll see used in the next command).
Now, you'll want to use the RAM you speficied as a DTE table previously. Specify the address of the first byte of the first DTE table, and the math (DTE index x 2) should take care of the rest.
Alternately, if you chose to make a seperate table for the first letter of each tile, and a sperate table for the second letter of each tile, you'd skip the ASL A (0A) command. (You might want to do this, since you can have only 256 connected bytes, so one table will allow you 128 pairs, while using two seperate tables will allow you 256 pairs)
LDA DTETblOffset,X BD 2-byte DTE Table Offset Reads a byte of the table
Only do the gray lines  if you determined backing up the X register was necessary. We will now save this value we took of the DTE table, and reload X to its prior contents.
STA DTEBackUpByte 85 DTEBackUpByte (if address is 1 byte long)
8D DTEBackUpByte (if address is 2 bytes long)
Save the value we read from our DTE table to RAM.
PLA 68 The PHA saved a copy of the X register that last time we used it. This command undoes that, and loads that value back into the accumulator.
TAX AA Again, moves the accumulator back to X.
LDA DTEBackUpByte A5 DTEBackUpByte (1-byte address)
AD DTEBackUpByte (2-byte address)
 
RTS 60 We've got our the value of the first tile, so let's go back and print it.
This is the 2ndtime routine. We'll go here when he have determined a second tile is necessary.
DEC freeRAMbyte C6 freeRAMbyte (if address is 1 byte long)
CE freeRAMbyte (if address is 2 bytes long)
Should return our UseDTE flag to "NO", so that the following byte can be processed properly.
DEC $9B C6 9B Decrease the value at $9B (our script pointer), causing the game to re-read the same byte it did on the previous tile.
LDA $9B A5 9B Read the value stored at $009B.
CMP #$FF C9 FF Check to see if the low byte of the script pointer has been changed from 00 to FF. If it hasn't, skip 2 bytes (those 2 bytes are the next command).
BNE $02 D0 02
The NES does not know that you are changing a value that is intended as a 2-byte value. It thinks you are changing a single independant byte. 00 - 1 = FF. As the NES is ignoring the high byte, you could cause an incorrect equation like 0200 - 1 = 02FF. The above couple of lines check for this, and the below one fixes it.
DEC $9C C6 9C Decreases the high byte of the pointer if necessary.
Again, only do the gray lines if you did them last time.
TXA 8A  
PHA 48  
LDY #$00 A0 00 -
LDA ($9B),Y B1 9B -
SEC 38 Preperation for the next line.
SBC #$15 E9 15 Because hex 15 = DTE 0.
ASL A 0A Use only if you store both 1st and 2nd tile in the same table.
TAX AA Again, turns your result into which element of the DTE table to use.
LDA DTETableOffset+1,X BD One byte higher than the starting offset of the DTE table (If using 1 table, since you want to use the odd-numbered bytes in the table (last time you used the even-numbered bytes) Use LDA DTE2ndTableOffset,X if you are using a second table.
More gray lines. If they weren't used on the first time, don't use them this time.
STA DTEBackUpByte 85 DTEBackUp (1-byte address)
8D DTEBackUp (2-byte address)
 
PLA    
TXA    
LDA DTEBackUpByte A5 DTEBackUp or AD DTEBackUp (you know the drill)  
RTS 60 Whoosh! End.

Now, let's put in the DTE values and check if they worked.
For the example, I would just go to $29710 and put in (using the text field of the hex editor, not the data field) 1516171819... and try playing the game. If we get random hex values in the text, we have succedded.

dtetbl.PNG (26102 bytes)

Now, let's check it out.

ys3game.PNG (14636 bytes)Makes great 13375P34|< !!
Tada! Note that the fact that substrings work is not always expected. Some games may use a seperate code to do that. In that case, you just repeat the procedure (looking for the "pointer" to the last byte of the last subtring printed on a screen of text).

Oh, yeah, that's right. I almost forgot to give my finished code, to compare with yours:

The Hexadecimal:

A5 CF C9 00 D0 1E A0 00 B1 9B C9 14 10 03 B1 9B
60 B1 9B C9 7F 10 F7 E6 CF B1 9B 38 E9 15 0A AA
BD 00 97 60 C6 CF C6 9B A5 9B C9 FF D0 02 C6 9C
A0 00 B1 9B 38 E9 15 0A AA BD 01 97 60

The Source:

LDA $CF
CMP #$00 (again, I'm including this instruction for clarity, but CMP #$00 after an instruction using A is unneeded.)
BNE $1F
LDY #$00
LDA ($9B),Y
CMP #$14
BPL $03
LDA ($9B),Y
RTS

LDA ($9B),Y
CMP #$7F
BPL $F7
INC $CF
LDA ($9B),Y
SEC
SBC #$15
ASL A
TAX
LDA $9700,X
RTS

DEC $CF
DEC $9B
LDA $9B
CMP #$FF
BNE $02
DEC $9C
LDY #$00
LDA ($9B),Y
SEC
SBC #$15
ASL A
TAX
LDA $9701,X
RTS